691 research outputs found

    Knowledge discovery with CRF-based clustering of named entities without a priori classes

    Get PDF
    International audienceKnowledge discovery aims at bringing out coherent groups of entities. It is usually based on clustering which necessitates defining a notion of similarity between the relevant entities. In this paper, we propose to divert a supervised machine learning technique (namely Conditional Random Fields, widely used for supervised labeling tasks) in order to calculate, indirectly and without supervision, similarities among text sequences. Our approach consists in generating artificial labeling problems on the data to reveal regularities between entities through their labeling. We describe how this framework can be implemented and experiment it on two information extraction/discovery tasks. The results demonstrate the usefulness of this unsupervised approach, and open many avenues for defining similarities for complex representations of textual data

    Anther-specific carbohydrate supply and restoration of metabolically engineered male sterility

    Get PDF
    Male-sterile plants are used in hybrid breeding as well as for gene confinement for genetically modified plants in field trials and agricultural production. Apart from naturally occurring mutations leading to male sterility, biotechnology has added new possibilities for obtaining male-sterile plants, although so far only one system is used in practical breeding due to limitations in propagating male-sterile plants without segregations in the next generation or insufficient restoration of fertility when fruits or seeds are to be harvested from the hybrid varieties. Here a novel mechanism of restoration for male sterility is presented that has been achieved by interference with extracellular invertase activity, which is normally specifically expressed in the anthers to supply the developing microspores with carbohydrates. Microspores are symplastically isolated in the locular space of the anthers, and thus an unloading pathway of assimilates via the apoplasmic space is mandatory for proper development of pollen. Antisense repression of the anther-specific cell wall invertase or interference with invertase activity by expressing a proteinacious inhibitor under the control of the anther-specific invertase promoter results in a block during early stages of pollen development, thus causing male sterility without having any pleiotropic effects. Restoration of fertility was successfully achieved by substituting the down-regulated endogenous plant invertase activity by a yeast invertase fused to the N-terminal portion of potato-derived vacuolar protein proteinase II (PiII–ScSuc2), under control of the orthologous anther-specific invertase promoter Nin88 from tobacco. The chimeric fusion PiII–ScSuc2 is known to be N-glycosylated and efficiently secreted from plant cells, leading to its apoplastic location. Furthermore, the Nin88::PiII-ScSuc2 fusion does not show effects on pollen development in the wild-type background. Thus, such plants can be used as paternal parents of a hybrid variety, thereby the introgression of Nin88::PiII-ScSuc2 to the hybrid is obtained and fertility is restored. In order to broaden the applicability of this male sterility/restoration system to other plant species, a phylogenic analysis of plant invertases(β-fructofuranosidases) and related genes of different species was carried out. This reveals a specific clustering of the cell wall invertases with anther-specific expression for dicotyl species and another cluster for monocotyl plants. Thus, in both groups of plants, there seems to be a kind of co-evolution, but no recent common ancestor of these members of the gene family. These findings provide a helpful orientation to classify corresponding candidate genes in further plant species, in addition to the species analysed so far (Arabidopsis, tobacco, tomato, potato, carrots, rice, and wheat)

    Combined SVM-CRFs for Biological Named Entity Recognition with Maximal Bidirectional Squeezing

    Get PDF
    Biological named entity recognition, the identification of biological terms in text, is essential for biomedical information extraction. Machine learning-based approaches have been widely applied in this area. However, the recognition performance of current approaches could still be improved. Our novel approach is to combine support vector machines (SVMs) and conditional random fields (CRFs), which can complement and facilitate each other. During the hybrid process, we use SVM to separate biological terms from non-biological terms, before we use CRFs to determine the types of biological terms, which makes full use of the power of SVM as a binary-class classifier and the data-labeling capacity of CRFs. We then merge the results of SVM and CRFs. To remove any inconsistencies that might result from the merging, we develop a useful algorithm and apply two rules. To ensure biological terms with a maximum length are identified, we propose a maximal bidirectional squeezing approach that finds the longest term. We also add a positive gain to rare events to reinforce their probability and avoid bias. Our approach will also gradually extend the context so more contextual information can be included. We examined the performance of four approaches with GENIA corpus and JNLPBA04 data. The combination of SVM and CRFs improved performance. The macro-precision, macro-recall, and macro-F1 of the SVM-CRFs hybrid approach surpassed conventional SVM and CRFs. After applying the new algorithms, the macro-F1 reached 91.67% with the GENIA corpus and 84.04% with the JNLPBA04 data

    Measurement of the cross-section of high transverse momentum vector bosons reconstructed as single jets and studies of jet substructure in pp collisions at √s = 7 TeV with the ATLAS detector

    Get PDF
    This paper presents a measurement of the cross-section for high transverse momentum W and Z bosons produced in pp collisions and decaying to all-hadronic final states. The data used in the analysis were recorded by the ATLAS detector at the CERN Large Hadron Collider at a centre-of-mass energy of √s = 7 TeV;{\rm Te}{\rm V}andcorrespondtoanintegratedluminosityof and correspond to an integrated luminosity of 4.6\;{\rm f}{{{\rm b}}^{-1}}.ThemeasurementisperformedbyreconstructingtheboostedWorZbosonsinsinglejets.ThereconstructedjetmassisusedtoidentifytheWandZbosons,andajetsubstructuremethodbasedonenergyclusterinformationinthejetcentreofmassframeisusedtosuppressthelargemultijetbackground.ThecrosssectionforeventswithahadronicallydecayingWorZboson,withtransversemomentum. The measurement is performed by reconstructing the boosted W or Z bosons in single jets. The reconstructed jet mass is used to identify the W and Z bosons, and a jet substructure method based on energy cluster information in the jet centre-of-mass frame is used to suppress the large multi-jet background. The cross-section for events with a hadronically decaying W or Z boson, with transverse momentum {{p}_{{\rm T}}}\gt 320\;{\rm Ge}{\rm V}andpseudorapidity and pseudorapidity |\eta |\lt 1.9,ismeasuredtobe, is measured to be {{\sigma }_{W+Z}}=8.5\pm 1.7$ pb and is compared to next-to-leading-order calculations. The selected events are further used to study jet grooming techniques

    Search for direct production of charginos and neutralinos in events with three leptons and missing transverse momentum in √s = 7 TeV pp collisions with the ATLAS detector

    Get PDF
    A search for the direct production of charginos and neutralinos in final states with three electrons or muons and missing transverse momentum is presented. The analysis is based on 4.7 fb−1 of proton–proton collision data delivered by the Large Hadron Collider and recorded with the ATLAS detector. Observations are consistent with Standard Model expectations in three signal regions that are either depleted or enriched in Z-boson decays. Upper limits at 95% confidence level are set in R-parity conserving phenomenological minimal supersymmetric models and in simplified models, significantly extending previous results

    Observation of associated near-side and away-side long-range correlations in √sNN=5.02  TeV proton-lead collisions with the ATLAS detector

    Get PDF
    Two-particle correlations in relative azimuthal angle (Δϕ) and pseudorapidity (Δη) are measured in √sNN=5.02  TeV p+Pb collisions using the ATLAS detector at the LHC. The measurements are performed using approximately 1  μb-1 of data as a function of transverse momentum (pT) and the transverse energy (ΣETPb) summed over 3.1<η<4.9 in the direction of the Pb beam. The correlation function, constructed from charged particles, exhibits a long-range (2<|Δη|<5) “near-side” (Δϕ∼0) correlation that grows rapidly with increasing ΣETPb. A long-range “away-side” (Δϕ∼π) correlation, obtained by subtracting the expected contributions from recoiling dijets and other sources estimated using events with small ΣETPb, is found to match the near-side correlation in magnitude, shape (in Δη and Δϕ) and ΣETPb dependence. The resultant Δϕ correlation is approximately symmetric about π/2, and is consistent with a dominant cos⁡2Δϕ modulation for all ΣETPb ranges and particle pT

    Search for direct pair production of the top squark in all-hadronic final states in proton-proton collisions at s√=8 TeV with the ATLAS detector

    Get PDF
    The results of a search for direct pair production of the scalar partner to the top quark using an integrated luminosity of 20.1fb−1 of proton–proton collision data at √s = 8 TeV recorded with the ATLAS detector at the LHC are reported. The top squark is assumed to decay via t˜→tχ˜01 or t˜→ bχ˜±1 →bW(∗)χ˜01 , where χ˜01 (χ˜±1 ) denotes the lightest neutralino (chargino) in supersymmetric models. The search targets a fully-hadronic final state in events with four or more jets and large missing transverse momentum. No significant excess over the Standard Model background prediction is observed, and exclusion limits are reported in terms of the top squark and neutralino masses and as a function of the branching fraction of t˜ → tχ˜01 . For a branching fraction of 100%, top squark masses in the range 270–645 GeV are excluded for χ˜01 masses below 30 GeV. For a branching fraction of 50% to either t˜ → tχ˜01 or t˜ → bχ˜±1 , and assuming the χ˜±1 mass to be twice the χ˜01 mass, top squark masses in the range 250–550 GeV are excluded for χ˜01 masses below 60 GeV

    Measurement of the production of a W boson in association with a charm quark in pp collisions at √s = 7 TeV with the ATLAS detector

    Get PDF
    The production of a W boson in association with a single charm quark is studied using 4.6 fb−1 of pp collision data at s√ = 7 TeV collected with the ATLAS detector at the Large Hadron Collider. In events in which a W boson decays to an electron or muon, the charm quark is tagged either by its semileptonic decay to a muon or by the presence of a charmed meson. The integrated and differential cross sections as a function of the pseudorapidity of the lepton from the W-boson decay are measured. Results are compared to the predictions of next-to-leading-order QCD calculations obtained from various parton distribution function parameterisations. The ratio of the strange-to-down sea-quark distributions is determined to be 0.96+0.26−0.30 at Q 2 = 1.9 GeV2, which supports the hypothesis of an SU(3)-symmetric composition of the light-quark sea. Additionally, the cross-section ratio σ(W + +c¯¯)/σ(W − + c) is compared to the predictions obtained using parton distribution function parameterisations with different assumptions about the s−s¯¯¯ quark asymmetry

    Measurement of χ c1 and χ c2 production with s√ = 7 TeV pp collisions at ATLAS

    Get PDF
    The prompt and non-prompt production cross-sections for the χ c1 and χ c2 charmonium states are measured in pp collisions at s√ = 7 TeV with the ATLAS detector at the LHC using 4.5 fb−1 of integrated luminosity. The χ c states are reconstructed through the radiative decay χ c → J/ψγ (with J/ψ → μ + μ −) where photons are reconstructed from γ → e + e − conversions. The production rate of the χ c2 state relative to the χ c1 state is measured for prompt and non-prompt χ c as a function of J/ψ transverse momentum. The prompt χ c cross-sections are combined with existing measurements of prompt J/ψ production to derive the fraction of prompt J/ψ produced in feed-down from χ c decays. The fractions of χ c1 and χ c2 produced in b-hadron decays are also measured
    corecore